class: center, middle, inverse, title-slide .title[ # ISA 444/544: Business Forecasting ] .subtitle[ ## 11: Forecasting Environment and Baseline Forecasts ] .author[ ###
Fadel M. Megahed, PhD
Raymond E. Glos Professor in Business
Farmer School of Business
Miami University
@FadelMegahed
fmegahed
fmegahed@miamioh.edu
Automated Scheduler for Office Hours
] .date[ ### Fall 2025 ] --- ## Learning Objectives for Today's Class - Install and import Nixtla's libraries ([StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html), [MLForecast](https://nixtlaverse.nixtla.io/mlforecast/index.html), [NeuralForecast](https://nixtlaverse.nixtla.io/neuralforecast/docs/getting-started/introduction.html), [UtilsForecast](https://nixtlaverse.nixtla.io/utilsforecast/index.html), and [TimeGPT](https://nixtlaverse.nixtla.io/nixtla/docs/getting-started/introduction.html)) for forecasting - Distinguish fixed window from rolling-origin - Introduce forecast accuracy metrics (MAE, MAPE, RMSE) - Recognize the importance of baseline forecasts in time series forecasting - Identify common baseline forecasts (Naive, Seasonal Naive, etc.) - Implement [baseline models with StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html#baseline-models) --- class: inverse, center, middle # The Nixtlaverse Open-Source Forecasting Libraries --- ## Nixtla's Forecasting Libraries <img src="data:image/png;base64,#../../figures/nixtlaverse.png" width="87%" style="display: block; margin: auto;" /> --- ## Nixtla's Forecasting Libraries Nixtla provides **several open-source Python libraries** (and a closed source **TimeGPT** tool accessible via API calls) for **scalable forecasting tasks**. These libraries are **relatively** easy to use and can be integrated into your future forecasting workflows: - **[StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html)** – Fast & scalable statistical models (`ARIMA`, `ETS`, etc.). - **[MLForecast](https://nixtlaverse.nixtla.io/mlforecast/index.html)** – Machine learning-based forecasting (e.g., `XGBoost`, `LightGBM`). - **[NeuralForecast](https://nixtlaverse.nixtla.io/neuralforecast/docs/getting-started/introduction.html)** – Deep learning models for time series (e.g., `NBEATS`, `NHITS`, and `TFT`). 
- **[UtilsForecast](https://nixtlaverse.nixtla.io/utilsforecast/index.html)** – Utility functions for plotting, evaluation, etc.
- **[TimeGPT](https://nixtlaverse.nixtla.io/nixtla/docs/getting-started/introduction.html)** – An AI transformer-powered forecasting API that requires minimal tuning.

These libraries enable **forecasting at scale, which you will need in practice**.

.footnote[
<html> <hr> </html>
**Note:** These libraries can be installed via `pip` and are described in detail in [nixtlaverse.nixtla.io](https://nixtlaverse.nixtla.io/). Use the left-hand navigation bar to explore the documentation for each library.
]

---
class: inverse, center, middle

# Fixed Window vs. Rolling-Origin

---

## The Fixed Window Evaluation Approach

- **Fixed Window** is the simplest approach to splitting your time series data into a training and a testing/holdout set.
- The **goal** is to **train your model on the training set** and **evaluate its performance on the testing set (the last `k` observations in the data)**.
  + .black[.bold[Note that this is quite different from traditional machine learning applications for cross-sectional data, where random splits are common.]]
- The evaluation on the **testing/holdout** set can serve two purposes:
  + **Model Evaluation**: Assess the model's **performance on unseen data** (since it is not used during training, and hence acts as a proxy for the model's performance on future data).
  + **Model Selection**: Compare the **performance of different models** to select the **best one**.
- Note that using this approach for model selection is **reasonable if your models do not involve hyperparameter tuning** (otherwise, you may overfit to the testing set).
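---

## A Fixed Window Split in Code

The split described above can be sketched in a few lines of `pandas`. The toy `df` below is made up for illustration; the `ds`/`y` column names simply mirror Nixtla's conventions.

```python
import pandas as pd

# Toy monthly series (hypothetical values, standing in for real data)
df = pd.DataFrame({
    'ds': pd.date_range('2022-01-01', periods=36, freq='MS'),  # timestamps
    'y': range(36),                                            # target values
})

k = 12                                    # size of the testing/holdout set
train, test = df.iloc[:-k], df.iloc[-k:]  # train = all but last k, test = last k

# Fit on `train` only; touch `test` exactly once, for evaluation.
```

Because the rows are ordered in time, slicing by position preserves the temporal ordering that a random train/test split would destroy.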
---

## The Fixed Window Evaluation Approach

<img src="data:image/png;base64,#11_forecasting_env_baseline_forecasts_files/figure-html/fixed_window-1.png" width="100%" style="display: block; margin: auto;" />

---

## The Rolling-Origin Evaluation Approach

- **Rolling-Origin Evaluation** is a method for splitting time series data into training and testing sets where the testing sets move forward over time.
- The **goal** is to **train the model on a subset of past observations** and evaluate its **performance on a future testing set at multiple time steps**.
- The key difference from a fixed window approach is that the **testing set shifts forward, allowing for multiple evaluations**:
  + The **training set may expand (expanding window)** or **remain fixed (rolling window)**.
  + This ensures that model performance is assessed across **different points in time**.
  + It **reduces sensitivity to the initial split point** and provides a more **robust evaluation** of model performance over time.

---

## Expanding Window Evaluation (By 1 Month)

<img src="data:image/png;base64,#../../figures/expanding_window_evaluation1mo.gif" width="100%" style="display: block; margin: auto;" />

---

## Expanding Window Evaluation (By 12 Months)

<img src="data:image/png;base64,#../../figures/expanding_window_evaluation12mo.gif" width="100%" style="display: block; margin: auto;" />

---

## Rolling Non-Expanding Window Evaluation

<img src="data:image/png;base64,#../../figures/nonexpanding_window_evaluation.gif" width="100%" style="display: block; margin: auto;" />

---

## Cross Validation within the Nixtlaverse

<img src="data:image/png;base64,#../../figures/ts_cross_validation_nixtla.png" width="63%" style="display: block; margin: auto;" />

.footnote[
<html> <hr> </html>
**Note:** The `cross_validation` method can also be applied to other Nixtla forecasting objects (e.g., `MLForecast`, `NeuralForecast`) to perform cross-validation for machine learning and deep learning models.
See the [StatsForecast Cross Validation Tutorial](https://nixtlaverse.nixtla.io/statsforecast/docs/tutorials/crossvalidation.html) to access the page shown above. ] --- ## Activity: The `cross_validation` Method <div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'><iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='315' src='https://www.mentimeter.com/app/presentation/alph1gvzvcf291iyb9typ19e28i46ptv/embed' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'></iframe></div> --- ## Recap of Fixed vs. Rolling-Origin - **Fixed Window**: + **Simplest approach** to splitting data into training and testing sets. + **Testing set is fixed** and **does not move forward** over time. + **Provides a single evaluation** of model performance. - **Rolling-Origin**: + **Testing set moves forward** over time. + **Training set may expand or remain fixed**. + **Provides multiple evaluations** of model performance. + **Reduces sensitivity to the initial split point**. + **More robust evaluation** of model performance over time. --- ## Recap of Fixed vs. Rolling-Origin - In practice, the **rolling-origin approach is preferred** for time series forecasting tasks since it mimics the real-world scenario of forecasting future data points. The choice of: + **Expanding vs. Rolling Window**, + **Window Size**, and + **Step Size** depends on the specific forecasting task and the data at hand. 
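---

## Sketching Rolling-Origin Splits in Code

The three design choices above can be made concrete with a small helper. The function below is a hypothetical illustration, not part of any Nixtla library (their `cross_validation` method handles this for you).

```python
def rolling_origin_splits(n, h, step=1, min_train=24, expanding=True):
    """Return (train_indices, test_indices) pairs for rolling-origin evaluation.

    n: series length; h: forecast horizon; step: how far the origin advances;
    min_train: size of the first training window;
    expanding: True = expanding window, False = fixed-size rolling window.
    """
    splits, origin = [], min_train
    while origin + h <= n:           # stop when the test window would run past the data
        start = 0 if expanding else origin - min_train
        splits.append((list(range(start, origin)),
                       list(range(origin, origin + h))))
        origin += step               # move the forecast origin forward
    return splits

# Three expanding-window evaluations: 12-month horizon, origin moves by 12 months
splits = rolling_origin_splits(n=72, h=12, step=12, min_train=36)
```

With `expanding=False`, the same call produces fixed-size (rolling) training windows instead — the other option discussed above.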
--- class: inverse, center, middle # Forecast Accuracy Metrics --- ## Model Performance Evaluation in the Nixtlaverse ``` python from utilsforecast.losses import * ``` ### `evaluate` ``` python evaluate (df:~AnyDFType, metrics:List[Callable], models:Optional[List[str]]=None, train_df:Optional[~AnyDFType]=None, level:Optional[List[int]]=None, id_col:str='unique_id', time_col:str='ds', target_col:str='y', agg_fn:Optional[str]=None) ``` .footnote[ <html> <hr> </html> **Source:** [Nixtla's UtilsForecast Evaluation Documentation](https://nixtlaverse.nixtla.io/utilsforecast/evaluation.html) ] --- ## Model Performance Evaluation in the Nixtlaverse (Cont.) .font80[ <table><thead><tr><th></th><th><strong>Type</strong></th><th><strong>Default</strong></th><th><strong>Details</strong></th></tr></thead><tbody><tr><td>df</td><td>AnyDFType</td><td></td><td>Forecasts to evaluate.<br/>Must have <code>id_col</code>, <code>time_col</code>, <code>target_col</code> and models’ predictions.</td></tr><tr><td>metrics</td><td>List</td><td></td><td>Functions with arguments <code>df</code>, <code>models</code>, <code>id_col</code>, <code>target_col</code> and optionally <code>train_df</code>.</td></tr><tr><td>models</td><td>Optional</td><td>None</td><td>Names of the models to evaluate.<br/>If <code>None</code> will use every column in the dataframe after removing id, time and target.</td></tr><tr><td>train_df</td><td>Optional</td><td>None</td><td>Training set. Used to evaluate metrics such as <a href="https://Nixtla.github.io/utilsforecast/losses.html#mase" target="_blank" rel="noreferrer"><code>mase</code></a>.</td></tr><tr><td>level</td><td>Optional</td><td>None</td><td>Prediction interval levels. 
Used to compute losses that rely on quantiles.</td></tr><tr><td>id_col</td><td>str</td><td>unique_id</td><td>Column that identifies each series.</td></tr><tr><td>time_col</td><td>str</td><td>ds</td><td>Column that identifies each timestep; its values can be timestamps or integers.</td></tr><tr><td>target_col</td><td>str</td><td>y</td><td>Column that contains the target.</td></tr><tr><td>agg_fn</td><td>Optional</td><td>None</td><td>Statistic to compute on the scores by id to reduce them to a single number.</td></tr><tr><td><strong>Returns</strong></td><td><strong>AnyDFType</strong></td><td></td><td><strong>Metrics with one row per (id, metric) combination and one column per model.<br/>If <code>agg_fn</code> is not <code>None</code>, there is only one row per metric.</strong></td></tr></tbody></table>
]

.footnote[
<html> <hr> </html>
**Source:** [Nixtla's UtilsForecast Evaluation Documentation](https://nixtlaverse.nixtla.io/utilsforecast/evaluation.html)
]

---

## Losses

The most important training signal is the forecast error, which is the difference between the observed value `\(y_{\tau}\)` and the prediction `\(\hat{y}_{\tau}\)` at time `\(\tau\)`:

`$$e_{\tau} = y_{\tau} - \hat{y}_{\tau} \quad \quad \tau \in \{t+1, \dots, t+H\}$$`

The training loss summarizes the forecast errors in different evaluation metrics.

.footnote[
<html> <hr> </html>
**Source:** [Nixtla's UtilsForecast Losses Documentation](https://nixtlaverse.nixtla.io/utilsforecast/losses.html)
]

---

## Scale-Dependent Errors: `mae`

**MAE** measures prediction accuracy by averaging the absolute deviations between forecasts and actual values.

`$$\text{MAE}(y_{\tau}, \hat{y}_{\tau}) = \frac{1}{H} \sum_{\tau = t+1}^{t+H} |y_{\tau} - \hat{y}_{\tau}|$$`

- **Interpretation**: Provides a straightforward measure of forecast accuracy; lower MAE indicates better performance.
- **Characteristic:** Does not penalize larger errors more than smaller ones; treats all errors equally.
.footnote[
<html> <hr> </html>
**Source:** [Nixtla's UtilsForecast Losses Documentation](https://nixtlaverse.nixtla.io/utilsforecast/losses.html)
]

---

## Scale-Dependent Errors: `rmse`

**RMSE** is the square root of the average of the squared differences between forecasts and actual values.

`$$\text{RMSE}(y_{\tau}, \hat{y}_{\tau}) = \sqrt{\frac{1}{H} \sum_{\tau = t+1}^{t+H} (y_{\tau} - \hat{y}_{\tau})^2}$$`

- **Interpretation**: Emphasizes larger errors due to squaring; useful when large errors are particularly undesirable.
- **Characteristic:** Penalizes large errors more than MAE (i.e., more sensitive to outliers compared to MAE).

.footnote[
<html> <hr> </html>
**Source:** [Nixtla's UtilsForecast Losses Documentation](https://nixtlaverse.nixtla.io/utilsforecast/losses.html)
]

---

## Percentage Errors: `mape`

**MAPE** calculates the average absolute error as a percentage of actual values.

`$$\text{MAPE}(y_{\tau}, \hat{y}_{\tau}) = \frac{1}{H} \sum_{\tau = t+1}^{t+H} \left| \frac{y_{\tau} - \hat{y}_{\tau}}{y_{\tau}} \right|$$`

**Interpretation**: Expresses forecast accuracy as a percentage; lower MAPE indicates better performance.

**Characteristic:** Can be misleading if actual values are close to zero, leading to extremely high MAPE values.

.footnote[
<html> <hr> </html>
**Source:** [Nixtla's UtilsForecast Losses Documentation](https://nixtlaverse.nixtla.io/utilsforecast/losses.html)
]

---
class: inverse, center, middle

# Importance of Baseline Forecasts

---

## Why do we need Baseline Forecasts?

.pull-left[
- Baseline forecasts are **simple, yet powerful**.
- Baseline forecasts serve as **important benchmarks**:
  - They **provide a performance "floor"** for more complex models.
  - Help **assess if advanced models can add meaningful value**.
- They are **easy to interpret** and **require minimal computational resources**.
]

.pull-right[
<img src="data:image/png;base64,#../../figures/flowchart_baseline.png" width="100%" style="display: block; margin: auto;" />
]

---
class: inverse, center, middle

# Common Baseline Forecasts

---

## Historic Mean/Average

- **Historic Mean/Average**: The simplest baseline forecast is to use the average of the historical data as the forecast for all future periods.
- **Mathematical Representation**: `\(\hat{y}_{t+h} = \frac{1}{t} \sum_{i=1}^{t} y_i\)`
- **Assumptions about the Data Generating Process (DGP)**:
  - Future values will be similar to the past.
  - The data generating process is **relatively stationary (fixed mean with random noise)**, with **no seasonality** and **no trends** (obviously, since the mean is fixed).

---

## Historic Mean/Average: Fixed Window

<img src="data:image/png;base64,#11_forecasting_env_baseline_forecasts_files/figure-html/historic_average-1.png" width="100%" style="display: block; margin: auto;" /><img src="data:image/png;base64,#11_forecasting_env_baseline_forecasts_files/figure-html/historic_average-2.png" width="100%" style="display: block; margin: auto;" />

.footnote[
<html> <hr> </html>
**Notes**: The plot above shows the historic average forecast for the retail trade data. The light gray region represents the training data, while the red region represents the test data. The vertical dashed line indicates the split point between the training and test data. The training data is used to fit the model, while the test data is used to evaluate the model's performance. **Obviously, the historic average forecast is a straight line, as it is the same for all future periods. Furthermore, it is a terrible choice for this dataset since the series clearly has a positive trend.**
]

---

## Naive Forecast

- **Naive Forecast**: The naive forecast assumes that the future value will be the same as the last observed value.
- **Mathematical Representation**: `\(\hat{y}_{t+h} = y_t\)`
- **Assumptions about the Data Generating Process (DGP)**:
  - Future values will be similar to the most recent past value.
  - The data generating process can either be **relatively stationary** or **follow a random walk**.

---

## Naive Forecast: Fixed Window

<img src="data:image/png;base64,#11_forecasting_env_baseline_forecasts_files/figure-html/naive_forecast-5.png" width="100%" style="display: block; margin: auto;" /><img src="data:image/png;base64,#11_forecasting_env_baseline_forecasts_files/figure-html/naive_forecast-6.png" width="100%" style="display: block; margin: auto;" />

.footnote[
<html> <hr> </html>
**Notes**: The plot above shows the naive forecast for the retail trade data. The light gray region represents the training data, while the red region represents the test data. The vertical dashed line indicates the split point between the training and test data. The training data is used to fit the model, while the test data is used to evaluate the model's performance. **The naive forecast lags the training data, and is a straight line for the holdout dataset since we do not update the naive forecast for the fixed window approach.** **Similar to the historic average forecast, the naive forecast is a terrible choice for this dataset since the series clearly has a positive trend.**
]

---

## Seasonal Naive Forecast

- **Seasonal Naive Forecast**: The seasonal naive forecast assumes that the future value will be the same as the last observed value from the same season.
- **Mathematical Representation**: `\(\hat{y}_{t+h} = y_{t+h-m}\)`, where `\(m\)` is the seasonal period.
- **Assumptions about the Data Generating Process (DGP)**:
  - Future values will be similar to the most recent past value from the same season.
  - The data generating process will be **seasonal but with no trends**.
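---

## The Baseline Formulas in Code

The three baselines map directly onto a few lines of `numpy`. These hand-rolled functions are only meant to make the formulas concrete; StatsForecast ships its own implementations, which we use next.

```python
import numpy as np

def historic_mean(y, h):
    """Historic average: y_hat_{t+h} = mean(y_1, ..., y_t) for every h."""
    return np.full(h, np.mean(y))

def naive(y, h):
    """Naive: y_hat_{t+h} = y_t, i.e., repeat the last observed value."""
    return np.full(h, y[-1])

def seasonal_naive(y, h, m):
    """Seasonal naive: y_hat_{t+h} = y_{t+h-m}, cycling the last season."""
    return np.resize(np.asarray(y)[-m:], h)

# Quarterly toy data (m = 4): the seasonal naive repeats the last four values
y = [10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 32.0, 42.0]
fcst = seasonal_naive(y, h=4, m=4)   # -> array([12., 22., 32., 42.])
```

Note how each function returns a length-`h` array: the historic mean and naive forecasts are flat, while the seasonal naive cycles through the most recent season.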
---

## Seasonal Naive Forecast: Fixed Window

<img src="data:image/png;base64,#11_forecasting_env_baseline_forecasts_files/figure-html/seasonal_naive_forecast-9.png" width="100%" style="display: block; margin: auto;" /><img src="data:image/png;base64,#11_forecasting_env_baseline_forecasts_files/figure-html/seasonal_naive_forecast-10.png" width="100%" style="display: block; margin: auto;" />

.footnote[
<html> <hr> </html>
**Notes**: The plot above shows the seasonal naive forecast for the retail trade data. The light gray region represents the training data, while the red region represents the test data. The vertical dashed line indicates the split point between the training and test data. The training data is used to fit the model, while the test data is used to evaluate the model's performance. **The seasonal naive forecast lags the training data, and performs worse than the naive model here (since forecasts now lag by 12 months instead of 1 month); it would perform better if the data were seasonal with no trend.**
]

---
class: inverse, center, middle

# Implementing Baseline Models with StatsForecast

---

## Live Demo

We will now implement the historic average, naive, and seasonal naive forecasts using the [StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html) library. Our in-class implementation will highlight both the fixed window and rolling-origin approaches.
--- class: inverse, center, middle # Recap --- ## Summary of Main Points By now, you should be able to do the following: - Install and import Nixtla's libraries ([StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html), [MLForecast](https://nixtlaverse.nixtla.io/mlforecast/index.html), [NeuralForecast](https://nixtlaverse.nixtla.io/neuralforecast/docs/getting-started/introduction.html), [UtilsForecast](https://nixtlaverse.nixtla.io/utilsforecast/index.html), and [TimeGPT](https://nixtlaverse.nixtla.io/nixtla/docs/getting-started/introduction.html)) for forecasting - Distinguish fixed window from rolling-origin - Introduce forecast accuracy metrics (MAE, MAPE, RMSE) - Recognize the importance of baseline forecasts in time series forecasting - Identify common baseline forecasts (Naive, Seasonal Naive, etc.) - Implement [baseline models with StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html#baseline-models) --- ## 📝 Review and Clarification 📝 1. **Class Notes**: Take some time to revisit your class notes for key insights and concepts. 2. **Zoom Recording**: The recording of today's class will be made available on Canvas approximately 3-4 hours after the session ends. 3. **Questions**: Please don't hesitate to ask for clarification on any topics discussed in class. It's crucial not to let questions accumulate.